Fix/array type ip binary#5516
Conversation
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit 352e53c.
The table above displays the top 10 most important findings. Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
b1f89a9 to
b567be6
Compare
PR Reviewer Guide 🔍(Review updated until commit fec82ff)Here are some key observations to aid the review process:
|
b567be6 to
69d24da
Compare
PR Code Suggestions ✨Latest suggestions up to fec82ff Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit b4a67c5
Suggestions up to commit b491034
Suggestions up to commit b491034
Suggestions up to commit 2734707
Suggestions up to commit 69d24da
|
|
Persistent review updated to latest commit 69d24da |
69d24da to
2734707
Compare
|
Persistent review updated to latest commit 2734707 |
|
Please take a look at the CI failures |
2734707 to
dad3901
Compare
dad3901 to
b491034
Compare
|
Persistent review updated to latest commit b491034 |
|
Persistent review updated to latest commit b491034 |
b491034 to
352e53c
Compare
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
352e53c to
b4a67c5
Compare
|
Persistent review updated to latest commit b4a67c5 |
V3 LIST accumulator was stringifying every element via String.valueOf, which conflicted with the ARG0_ARRAY return-type declaration on PPLBuiltinOperators.LIST and threw ClassCastException at runtime (e.g. (Boolean) "true"). Drop the String.valueOf and accumulate raw Object values; update CalciteMultiValueStatsIT numeric/boolean assertions to match the unquoted JSON output. Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
b4a67c5 to
fec82ff
Compare
|
Persistent review updated to latest commit fec82ff |
| TEST_INDEX_DATATYPE_NONNUMERIC)); | ||
| verifySchema(response, schema("bool_list", "array")); | ||
| verifyDataRows(response, rows(List.of("true"))); | ||
| verifyDataRows(response, rows(List.of(true))); |
There was a problem hiding this comment.
@ps48 Do u remember any reasons we enforce return ARRAY(STRING) in PPL?
There was a problem hiding this comment.
This was based on industry standards and workings of multivalue stats commands.
There was a problem hiding this comment.
Can you provide example on this why the return value is always kept string?
There was a problem hiding this comment.
here's the RFC and the decisions we made through the initial implementation #4026
There was a problem hiding this comment.
Thanks @penghuo @ps48 , I fixed it by rewriting the plan which will keep the contract same.
opensearch-project/OpenSearch#21997
Description
Companion fix for the LIST nullability change in opensearch-project/OpenSearch#21997. After that fix,
stats list(<field>)returns HTTP 200 for 10 of 12 element types — butlist(ip_value)andlist(binary_value)still fail withHTTP 400 ExpressionEvaluationException: unsupported object class [B. This PR fixes those two.Root cause
Two layers, each correct in isolation:
PPLBuiltinOperators.LISTwas registered with return typePPLReturnTypes.STRING_ARRAY, which always returnsARRAY<VARCHAR>regardless of input element type — type information is lost.AnalyticsExecutionEngine.toExprValue) checkstype instanceof IpType/BinaryTypeper cell. With the column type collapsed toARRAY<VARCHAR>it never matches, so per-elementbyte[]values fall through toExprValueUtils.fromObjectValue, which has nobyte[]case and throws.Fix
Two small changes:
PPLBuiltinOperators.LIST: switch return-type inference fromSTRING_ARRAYtoARG0_ARRAY, so the column type at the response boundary becomesARRAY<IpType>/ARRAY<BinaryType>(matching the existingTAKEregistration).AnalyticsExecutionEngine.toExprValue: when the column isARRAY<IpType>/ARRAY<BinaryType>, recurse element-wise and apply the same canonical-IP-string / Base64 conversion already used for scalarIpType/BinaryTypecells. Extracted the two byte[] converters into helpers (ipBytesToExprValue,binaryBytesToExprValue) so the scalar and array paths share them.Testing
stats list(ip_value)unsupported object class [B[["127.0.0.1"]]stats list(binary_value)unsupported object class [B[["U29tZSBiaW5hcnkgYmxvYg=="]]stats list(<other 10 types>)Out of scope
VALUESaggregate registration still usesSTRING_ARRAY. Same fix would apply if anyone exercisesvalues(ip_value)/values(binary_value); flagging for follow-up.ListAggFunction.javastill doesString.valueOf(byte[])— latent on the V3 (non-analytics-engine) path. Analytics path uses DataFusion'sLOCAL_ARRAY_AGG_OPinstead, so it doesn't fire here.